ABSTRACT

Of great concern among researchers is the 
effectiveness of holistic scoring, which is necessarily 
product-centered and decontextualized, in measuring writing quality,
the mental processes necessary for writing, or teaching skill. The 
Committee on Teaching and Its Evaluation in Composition of the 
Conference on College Composition and Communication has made 
suggestions for shifting the focus of evaluation from the product to 
the process. Furthermore, it stresses the need and makes suggestions 
for viewing language teaching from a constructivist stance rather
than from a reductionist position. The writing quality evaluations of 
much of the research in written composition, clearly more 
reductionist than constructivist, fail to take into account the
purpose of the writer, the writer's audience, or the sociopolitical 
context of the writing act. Future research in writing should utilize 
the six evaluation instruments suggested by the committee that take 
into account the goals of both course and teacher, the background and 
preparation of students, and many other factors critical to the 
learning experience. Research in written composition should also 
consider language change over periods longer than a single academic 
term. Unfortunately, researchers today are caught between the 
expediency of experimental research and the completeness of 
naturalistic inquiry. One demands empirical inquiry based on the 
tenets of logical positivism, and the other requires a costly 
commitment to naturalistic inquiry based on the belief in a multiple 
reality. (HOD) 
Experimental Research in Written Composition. [page 1]



A recent article in College Composition and Communication raises
serious questions about our "science consciousness" in teaching and
research in composition.[1] Since the publication of Research in
Written Composition nearly two decades ago, scores of classroom
researchers have described and quantified writing samples, lexical
choices, errors, T-units, clauses, sentences, and nearly everything
else that is countable or measurable, in countless attempts to evaluate
writing instruction, courses, and programs. Indeed, "empirical
research" has become a paradigm for doctoral research. Yet, time after
time, the results have been termed "not statistically significant," as
in my own doctoral research.[2] Other times the experimental design
was thought to be flawed in one or more ways. The reaction to such
failure to discriminate among experimental outcomes is most commonly to
call for more experimentation within this model, much in line with many
recent doctoral studies.

But more recent speculation on the transmission and measurement of
literacy calls that paradigm for educational research into
question.[3] The limits of what can be discovered and understood
through narrowly empirical research on writing are now being defined in
more realistic terms,[4] and it may be that a broader research
paradigm will prove more fruitful. Specifically, the effectiveness of
holistic scoring is now being questioned by some researchers, while
others are calling for research within a naturalistic paradigm.

Holistic scoring and T-unit analysis are questionable measures of
changes in writing quality, especially over periods of time as short as
one semester.[5] Grobe suggests that holistic scoring may be more
influenced by essay length and correctness of spelling than by
syntactic complexity, and that the scoring may be more influenced by
vocabulary than anything else,[6] a suggestion corroborated by Neilson
and Piché.[7] Writing samples of college students at two West Virginia
colleges further support the notion that essay length may influence
holistic scoring. Of three groups of freshman writers responding to
similar writing assignments, students enrolled in basic writing wrote
the shortest essays (mean length, 157 words) and received the lowest
holistic scores, with a mean score of 1.9 on a scale of 1 to 6.
Composition I students wrote essays with a mean length of 272 words and
received a mean holistic score of 3.2. The group receiving the highest
mean holistic score, the Composition II students, also wrote the
longest essays; they received a mean score of 4.4 for essays with a
mean length of 427 words. The between-groups differences in holistic
scores and in essay lengths were both statistically significant,
supporting the earlier findings of Freedman[9] and of Nold and
Freedman[10] that the length of student essays has a definite
influence on holistic scores. To borrow the metaphor of a recent
article by Robert Gorrell,[11] scoring essay quality by a measure
heavily influenced by length is about as useful as judging the quality
of Mulligan stew by the size of the pot.
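
The pattern in the three reported group means can be made concrete with
a small calculation. The sketch below is illustrative only: it uses
nothing but the three pairs of means given above (not the individual
essays, which are not reproduced here), and the names pearson_r and
words_per_point are hypothetical helpers, not part of the original
study. With only three points it cannot establish significance; it
merely shows how closely mean score tracks mean length across the
groups.

```python
from math import sqrt

# Group means reported in the text, in the order:
# basic writing, Composition I, Composition II.
mean_lengths = [157, 272, 427]   # mean essay length in words
mean_scores = [1.9, 3.2, 4.4]    # mean holistic score on the 1-6 scale


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Correlation across the three group means (illustrative, n = 3).
r = pearson_r(mean_lengths, mean_scores)

# Rough exchange rate between the lowest and highest groups:
# how many additional words accompany one additional holistic point.
words_per_point = (mean_lengths[-1] - mean_lengths[0]) / (
    mean_scores[-1] - mean_scores[0]
)

print(f"r across the three group means: {r:.3f}")
print(f"words per holistic point: {words_per_point:.0f}")
```

Because the groups also differ in course level and preparation, a high
value here is consistent with the concern raised above, that length and
quality are confounded in the score, but does not by itself separate
the two.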

But there is a concern greater than whether length, vocabulary,
syntactic complexity, or spelling has the most influence on the
holistic scores assigned by evaluators of writing samples. That concern
is whether holistic scoring, necessarily product-centered and
de-contextualized, is an effective measure of writing quality,
effectiveness of instruction, or anything other than how well a writing
sample simulates an Idealized Text,[12] i.e., whether such a
product-centered method of evaluation can be used to evaluate the
mental processes necessary for writing. Mulligan stew is judged by its
product, without regard for the context of its creation; furthermore,
the making of a stew does not require the higher cortical functions
composing an essay requires, for stew-making is largely a linear,
left-brain activity performed on a finite inventory of ingredients in a
context of little variation. The composing process, on the other hand,
requires higher cortical functions of both hemispheres: holistic
cognitive processes involving a nearly infinite number of possibilities
in a context influenced by a wide range of factors.

In discussing the inadequacies of the rating method employed in a
recent study of writing quality, all of the raters complained that the
evaluation techniques required product-centered evaluation based on an
artificial rubric that, while developed specifically for the essay
topics by prominent composition researchers, was inadequate for
evaluating what the papers really deserved, based on what the raters
perceived as the students' intentions within the writing context. One
rater dropped from the study because he could not rate themes as
products. Another rater was unable to adhere to the strict rubric
provided, so he failed to reach a level of inter-rater reliability
acceptable for holistic scoring. The other two raters, experienced in
blocking their subjective responses at holistic scoring sessions,
attained a high inter-rater reliability by adhering to the
product-centered criteria of the rubric.

The purpose of thorough training for holistic scoring sessions is to
induce agreement among the raters, agreement based on writing samples
that represent the various scores allowed by the rubric, what Brannon
and Knoblauch characterize as an Idealized Text. A further purpose is
to prevent any consideration of context or environment of the writing
act, a sort of context-stripping designed to further reduce rater
variability. It is not possible, within what some have labeled the
agricultural-botany paradigm,[13] to evaluate writing samples
without first stripping the product from the context that produced it
and then ignoring the writer's intention and objective. The
product-centered evaluation of decontextualized writing samples is a
fault of the research design of the past several decades, brought into
being by our sincere desire to be scientific in our research. The
result often is the superficial evaluation of surface detail variables,
or evaluations that don't evaluate what matters. The CCCC Committee
on Teaching and Its Evaluation in Composition has made suggestions for
shifting the perspective of our evaluations to the teaching process
rather than the product,[14] suggestions for viewing language
teaching from a constructivist stance rather than from a reductionist
position.

In "A Holistic View of Language," Roger Shuy makes clear the
distinction between the constructivist and the reductionist
approaches to language. The reductionist view of language is "that
learners learn best small things before large things and that by taking
natural language apart and by cutting it into pieces, the learner can
best benefit."[15] The constructivist view (or holistic view), on the
other hand, "prefers to see the elemental parts within a meaningful
whole,"[16] integrating both language and sociolinguistic competence
to achieve a communicative competence dependent upon both social and
linguistic context for meaning. The writing quality evaluations of
much of the research in written composition are clearly more
reductionist than constructivist, failing to take into account the
purpose of the writer, the writer's audience, and the socio-political
context of the writing act under evaluation, which is precisely the
point at which modern literary criticism and modern linguistics have
consistently failed.[17] Until the variables of learning and
rhetorical context are accounted for, much of the experimental research
in composition will lack the "power" to produce meaningful
results.[18] Perhaps further re-evaluations of holistic scoring and
the popular quantitative measures are in order.

Future research in writing instruction and future evaluations of
writing instruction and courses may well utilize the six evaluation
instruments suggested by the CCCC Committee on Teaching and Its
Evaluation in Composition.[19] The instruments take into
consideration the goals of both course and teachers, the background and
preparation of the students, and many other factors critical to the
context of the learning experience. Even when quantitative analyses
are desirable, the suggested evaluation instruments will aid the
interpretation of the statistics by providing valuable information
about the context that created the quantifiable data. The CCCC
Committee's instrument is more naturalistic than experimental, and for
that reason alone is more suitable for evaluation of teacher and
program effectiveness because, like the naturalist, it "sees reality as
context dependent rather than fixed and discoverable," as the
experimentalist views the world.[20] Thus, the philosophical base for
future research in written composition and for evaluation of teacher
and program effectiveness should shift from positivism toward
phenomenology, regardless of a recent admonition to utilize an
empirically-developed evaluation instrument.[21]

Finally, future research in written composition should be expanded in
time to consider language change over periods longer than a single
academic term. But longitudinal studies are costly in both time and
money, and doctoral students, especially, feel the pinch of both. So,
the world remains divided, and there are still those who insist on high
production, as pointed out by James Kinney, who quotes a reviewer for
Research in the Teaching of English as encouraging the abandonment of
naturalistic inquiry because experimental research gets faster
results.[22]

James Kinney is not alone in his dismay, for doctoral students
and other young scholars are caught between the expediency of
experimental research and the completeness of naturalistic inquiry.
Janet Emig sees the dichotomy in current inquiry paradigms as a matter
of perception and "how we elect to define what is distinctly human
about human life."[23] On the one hand is the expedience of
de-contextualized experimental inquiry, empirical inquiry based on the
tenets of logical positivism; and on the other hand is the high cost of
a commitment to long-term naturalistic inquiry, naturalistic inquiry
based on the existence of a multiple reality. Andrea Lunsford
undertook a count of nouns in a large sample of writing by basic
writers. Overwhelmed by her data, she wrote to Mina Shaughnessy,
shortly before the latter's death. Shaughnessy wrote back, "Do your
word counts, but remember to listen to what your students are
saying."[24]

Perhaps we need modes of inquiry that will allow us to listen
to what our students are saying as well as to collect data. What is 
knowable is knowable in a number of ways, and that dictates that 
neither experimental nor naturalistic inquiry into the writing and 
writing instruction processes will be abandoned; instead, each will 
complement the findings of the other. 